Dear Andrew,
First of all, thank you so much for all the work you put into your blog. It’s the best resource I have found for the sort of modeling questions I encounter in my work.
If I may, I have a question concerning the definitions of “marginal” vs. “conditional” effects. I have encountered what would appear to be at least three alternative definitions:
In your post “Marginalia”, you use the “slider” vs. “switch” analogy; marginal effects are partial derivatives for continuous variables, while conditional effects are discrete “deltas” associated with factor levels. Fair enough — a good, clean definition.
In your more recent post, you discuss how the terms marginal and conditional take on a different meaning in the context of hierarchical modeling. If I understand correctly, conditional effects are based solely on the fixed terms of the model, while marginal effects (somehow) incorporate the uncertainty arising from the random effects.
In the documentation of ggeffects, Daniel Lüdecke offers what would seem to be yet another definition: “[The] effects returned by ggpredict() can be described as conditional effects (i.e. these are conditioned on certain levels of factors), while ggemmeans() and ggeffect() return marginal means, since the effects are ‘marginalized’ (or ‘averaged’) over the levels of factors.”
I’d be very grateful for any light you can shed on this.
Best,
Doug
Hello!
Unfortunately all three definitions are true 😭
So marginal effects can refer to little changes in a slider (partial derivatives, or differential calculus), and also to “marginalizing” and averaging across variables and finding marginal means, like the {ggeffects} definition (averages, or integral calculus). The same word means opposite concepts(!). PLUS in the multilevel model paradigm, it refers to dealing with (or not dealing with) random effects.
It’s all a horrible mess.
Andrew Heiss
Marginal and conditional effects are about what we do with models once we’ve gone through all the trouble of making them. All this depends, of course, on understanding how your model works in the first place, and how it relates to the world that it supposedly represents.
For us ecologists, working through all this is a healthy exercise.
Our workflow typically goes something like this. You begin with an
understanding of the world that inspires you to measure things. Having
gotten your measurements, you place your understanding of the world
safely on a shelf somewhere and proceed to the “real work” of doing
stats with your measurements. Pretty soon you realize that those t-tests
you learned in stats class aren’t going to help. Your adviser shows you
how to use aov() to produce tables that you don’t
understand. Then somebody introduces you to the lm()
function and says that aov() is totally passé, something
about effect sizes, the 1990s, Nirvana, etc. You nod politely and get to
work. Just when you finish saving
analysis_final_final_v2.r, somebody asks you why you’re
fitting a Gaussian model to count data. The real question, you think to
yourself, is why I’m doing this PhD program…but you smile and nod and
try glm() instead. Now your model is spitting out
completely different numbers, but some of the p-values still
look alright, so you breathe a sigh of relief and start writing your
results section. Then your adviser, who has just now looked at that
preliminary report you sent 3 months ago, asks why you didn’t include
region as a covariate in your model, because that’s how he did it in
that Ecology paper from 1995. This triggers painful flashbacks
about aov() and the Backstreet Boys, but you recover and
ask Google what to do with categorical variables. A jolly looking
bearded chap named Ben Bolker keeps writing about “mixed effects models”
and some package called lme4, so, with a
glmer() of hope, you turn to
analysis_final_final_v13.rmd (because somebody told you
you should try RMarkdown) and figure out where the | symbol is on your keyboard.
You press ctrl-enter, and your world falls apart. Whatever “convergence”
means, it’s not happening. The summary table gives you the impression
that aov() is taking revenge for all the bad things you’ve
said about it. Worst of all, the p-values are just gone,
vanished without a trace. You begin hallucinating. Four months later,
sunshine and birdsong wake you from your stats coma. Spring has come
again. Gandalf and Sam are standing by your bedside. To your surprise,
you look at
analysis_final_final_v37_fuckfuck_vfsebhvfehkbfsdjkhbjnlmk.qmd
(Quarto evidently came along during your coma) and discover what looks
like a working model. It has p-values. It has diagnostic plots that
aren’t so bad. In your delirium, you made beautiful plots with
gracefully sweeping curves and stout, self-assured boxplots. Gingerly,
you take your understanding of the world from its shelf, blow off a
layer of dust, and begin to write…
Okay, maybe that’s a bit of an exaggeration. Maybe your experience hasn’t been as traumatic as mine. But the point is that understanding marginal and conditional effects will require a deep dive into things you may have never fully understood about your models. Hang on tight.
Please note that throughout this demonstration, I will be ignoring completely the issue of causal inference, which is the most important thing about any statistical analysis (at least in ecology). The point is just to illustrate the behavior of models.
# Data
library(palmerpenguins)
# Frequentist modeling and visualization
library(lme4)
library(mgcv)
library(ggeffects)
# Bayesian modeling and visualization
library(brms)
library(tidybayes)
library(ggdist)
# Post-hoc analyses
library(emmeans)
library(marginaleffects)
# Handling and visualization
library(tidyverse)
library(ggthemes)
library(tidymodels)
library(modelsummary)
library(modelr)
library(see)
You might be wondering what the palmerpenguins dataset
is about. If you guessed penguins, you’re right. After reading the data
in from the palmerpenguins package, we have a data frame
called penguins. We need to make one quick modification. In
the original data, the variable year is coded as an
integer column. We want to change year to a factor column
so that we can treat it as a discrete rather than continuous variable,
and we can do this in one line with a call to
dplyr::mutate.
data("penguins", package = "palmerpenguins") # read in data from package
penguins <- penguins %>%
mutate(year = factor(year)) # convert `year` to factor
datasummary_skim(penguins, type = "categorical")
|         |           | N   | %    |
|---------|-----------|-----|------|
| species | Adelie    | 152 | 44.2 |
|         | Chinstrap | 68  | 19.8 |
|         | Gentoo    | 124 | 36.0 |
| island  | Biscoe    | 168 | 48.8 |
|         | Dream     | 124 | 36.0 |
|         | Torgersen | 52  | 15.1 |
| sex     | female    | 165 | 48.0 |
|         | male      | 168 | 48.8 |
| year    | 2007      | 110 | 32.0 |
|         | 2008      | 114 | 33.1 |
|         | 2009      | 120 | 34.9 |
datasummary_skim(penguins, type = "numeric")
|                   | Unique (#) | Missing (%) | Mean   | SD    | Min    | Median | Max    |
|-------------------|------------|-------------|--------|-------|--------|--------|--------|
| bill_length_mm    | 165        | 1           | 43.9   | 5.5   | 32.1   | 44.5   | 59.6   |
| bill_depth_mm     | 81         | 1           | 17.2   | 2.0   | 13.1   | 17.3   | 21.5   |
| flipper_length_mm | 56         | 1           | 200.9  | 14.1  | 172.0  | 197.0  | 231.0  |
| body_mass_g       | 95         | 1           | 4201.8 | 802.0 | 2700.0 | 4050.0 | 6300.0 |
Before we talk about models, I think it is worth mentioning that we don’t always need them. Let’s say someone has claimed that the sex ratios of penguins changed dramatically in response to the 2008 financial crisis. In this case, there is nothing a model can tell you that isn’t already obvious in a simple visualization of the data, and it would be silly to try fitting a binomial regression. This is a trivial example, but there are real questions in ecology that are just as trivial.
ggplot(filter(penguins, !is.na(sex)), aes(year, fill = sex)) +
geom_bar(position = "fill") +
scale_fill_solarized() +
facet_wrap(~species) +
theme_lucid()
In all but the simplest cases, though, merely visualizing the data is insufficient and potentially misleading.
Let’s say we are interested in understanding bill depth
as a linear function of body mass. We’re finally going to
settle the classic question of whether big birds have big beaks. So, we
plot bill depth ~ body mass and add a convenient best-fit
line with geom_smooth. Lo and behold, there would appear to
be an exciting, counter-intuitive finding. Write it up for
Nature!
ggplot(penguins, aes(body_mass_g, bill_depth_mm)) +
geom_point() +
geom_smooth(method = "lm") +
theme_lucid()
Again, this is a crude example, and it would be obvious to anybody
who doesn’t work for Morgan
Stanley that there is something very wrong with that smooth line.
But much subtler versions of this problem happen all the time in
ecology. In this particular case, we can make the data visualization
much less misleading simply by adding species to the
plot.
ggplot(penguins, aes(body_mass_g, bill_depth_mm, color = species)) +
geom_point() +
geom_smooth(method = "lm") +
scale_color_solarized() +
theme_lucid()
But what if flipper length also has something to do with
bill depth? This becomes a harder problem to solve with
mere visualization. Adding an alpha aesthetic doesn’t help much.
We need a model.
ggplot(penguins, aes(body_mass_g, bill_depth_mm, color = species, alpha = flipper_length_mm)) +
geom_point() +
geom_smooth(method = "lm") +
scale_color_solarized() +
theme_lucid()
Consider the model below.
mod.01 <- lm(bill_depth_mm ~ body_mass_g + flipper_length_mm + species,
data = penguins)
It’s one way to express the question we just asked: how does
bill depth relate to body mass,
flipper length, and species? When we specify a
linear model like this, we are saying that the bill depth of any given
penguin \(i\) falls within a normal
distribution around a mean of \(\mu\)
with a standard deviation of \(\sigma\).
\[ bill.depth \sim \mathcal{N}(\mu, \sigma) \] \(\mu\) is called the linear predictor, because it is a linear function that predicts the mean of the response variable. In our model, \(\mu\) takes the following form:
\[ \mu = \beta_0 + \beta_1body.mass + \beta_2flipper.length + \beta_3species.Chinstrap + \beta_4species.Gentoo \]
To be honest, this is about where my mathematical understanding of linear regression ends. There are magical algorithms that use things like calculus and linear algebra to estimate each of the \(\beta\) parameters (along with the standard deviation \(\sigma\)) from the data. Rather than going into how they are calculated, let’s focus on what the parameters mean.
\(\beta_0\): The value of the
response variable when body mass and
flipper length equal zero and species equals
the reference level of Adelie. If you’re trying to imagine an Adelie
penguin with zero body mass, you’ve already noticed an inherent
limitation of linear modeling. But that is what the parameter as such
means. As long as we do not try to use the model to predict outcomes for
unrealistic values of the predictor variables, the model can still be
useful. This is a helpful reminder that every model can be
reduced to absurdity, because models of the world are not the world.
Anyway, moving along…
\(\beta_1\): The slope of
bill depth ~ body mass, conditional on
flipper length being set to its mean and
species being set to its reference level (Adelie) — but in
a model without interactions, the assumption is that this slope is
constant across all covariate values. This slope is — drum roll — the
marginal effect of body mass on
bill depth.
\(\beta_2\): The slope of
bill depth ~ flipper length, conditional on
body mass being set to its mean and species
being set to its reference level (Adelie). Again, in a model without
interactions, the assumption is that this slope is constant across all
covariate values. This slope is the marginal effect of
flipper length on bill depth.
\(\beta_3\): The change in
bill depth when you go from species = Adelie
to species = Chinstrap, conditional on
flipper length and body mass being set to
their means. This difference is the conditional effect
of species = Chinstrap on bill depth.
\(\beta_4\): The change in
bill depth when you go from species = Adelie
to species = Gentoo, conditional on
flipper length and body mass being set to
their means. This difference is the conditional effect
of species = Gentoo on bill depth.
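To see that the linear predictor really is just an equation you can solve by hand, here is a quick sanity check (a sketch; the covariate values of 4000 g and 210 mm are arbitrary, chosen only for illustration): we rebuild \(\mu\) from the fitted coefficients and confirm that predict() gives the same answer.

```r
library(palmerpenguins)

data("penguins", package = "palmerpenguins")

# Same model as mod.01 above
mod.01 <- lm(bill_depth_mm ~ body_mass_g + flipper_length_mm + species,
             data = penguins)

# Rebuild the linear predictor by hand for a hypothetical 4000 g Gentoo
# with 210 mm flippers
b <- coef(mod.01)
mu <- unname(b["(Intercept)"] +
             b["body_mass_g"] * 4000 +
             b["flipper_length_mm"] * 210 +
             b["speciesGentoo"] * 1)  # dummy = 1 because species = Gentoo

# predict() solves the same equation, so the two should agree exactly
mu_check <- predict(mod.01,
                    newdata = data.frame(body_mass_g = 4000,
                                         flipper_length_mm = 210,
                                         species = "Gentoo"))
```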
Here we finally encounter the terms that this R Club is all about: marginal and conditional effects. A marginal effect is the slope of the response variable in relation to a continuous predictor (conditional on all covariates), while a conditional effect is the discrete change of the response variable in relation to a categorical variable (again, conditional on all covariates).
In ecology, these \(\beta\)
parameters become the focus of our interpretation, which is almost
invariably causal. As I said, we are not focusing on causal
inference in this workshop, but when we use linear regression in our
research, we interpret marginal and conditional effects as
causal effects, meaning that intervening in the system to
change, say, body mass, would cause bill depth
to change at a slope of \(\beta_1\).
Let’s take a look at the parameter estimates that the
mod.01 gives us. We’ll use tidymodels to
extract the model estimates into a nice clean tibble, but the same
information can be obtained with summary(). The
estimate column contains our \(\beta\) parameters, which, as we discussed
above, are the marginal and conditional effects of each of our predictor
variables. We can calculate approximate 95% confidence intervals for each
estimate by adding ±2 standard errors. For convenience, we’ll also make a
column that distinguishes marginal effects, conditional effects, and the
intercept.
mod.01_estimates <- tidy(mod.01) %>%
mutate(upper.95 = estimate + 2*std.error,
lower.95 = estimate - 2*std.error) %>%
mutate(class = case_when(
term == "(Intercept)" ~ "Intercept",
term %in% c("body_mass_g", "flipper_length_mm") ~ "Marginal effects",
term %in% c("speciesChinstrap", "speciesGentoo") ~ "Conditional effects"
))
mod.01_estimates
## # A tibble: 5 × 8
## term estimate std.error statistic p.value upper.95 lower.95 class
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 (Intercept) 7.72 1.43 5.39 1.36e- 7 10.6 4.86e+0 Inte…
## 2 body_mass_g 0.00124 0.000125 9.93 1.45e-20 0.00149 9.92e-4 Marg…
## 3 flipper_length_… 0.0317 0.00870 3.65 3.09e- 4 0.0491 1.43e-2 Marg…
## 4 speciesChinstrap -0.152 0.135 -1.13 2.61e- 1 0.118 -4.23e-1 Cond…
## 5 speciesGentoo -5.94 0.221 -26.8 1.54e-85 -5.49 -6.38e+0 Cond…
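As an aside, the ±2×SE intervals above are a quick normal approximation; stats::confint() computes the exact t-based intervals, which at this many residual degrees of freedom are nearly identical. A minimal sketch, refitting the same model:

```r
library(palmerpenguins)

data("penguins", package = "palmerpenguins")

# Same model as mod.01 above
mod.01 <- lm(bill_depth_mm ~ body_mass_g + flipper_length_mm + species,
             data = penguins)

# Exact t-based 95% confidence intervals for every coefficient
ci <- confint(mod.01, level = 0.95)
ci
```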
Let’s visualize these marginal and conditional effects.
ggplot(filter(mod.01_estimates, class != "Intercept"),
aes(term, estimate,
ymin = lower.95,
ymax = upper.95)) +
geom_point() +
geom_errorbar(width = 0.25) +
facet_wrap(~class, scales = "free") +
theme_lucid()
The conditional effect of species = Chinstrap (\(\beta_3\)) is negative but close to zero,
with its 95% confidence interval including positive values. At alpha =
0.05, then, we would call this effect non-significant. All else held
equal, Adelie and Chinstrap penguins have similar
bill depth.
The conditional effect of species = Gentoo (\(\beta_4\)), however, is around -6, with a
95% confidence interval extending from around -6.4 to around -5.5. This
means that bill depth decreases by about 6 mm when you go
from species = Adelie to species = Gentoo.
The marginal effect of body size (\(\beta_1\)) is around 0.001, but with a very
tight confidence interval that makes it significantly positive at alpha
= 0.05. Remember, the units of the predictor will determine the scale of
the \(\beta\) coefficient. Since we’re
measuring body mass in grams, it is not surprising that the slope of
bill depth ~ body mass is small. It might make sense to
rescale body mass to kg units, but we’ll leave it as it is
for now. For every 1 g increase in body mass, we expect a
0.001 mm increase in bill depth.
The marginal effect of flipper length (\(\beta_2\)) is around 0.03, significantly
positive at alpha = 0.05. This means that for every 1 mm increase in
flipper length (again, we’re dealing with small units
here), we expect a 0.03 mm increase in bill depth.
In this relatively simple model, it is fairly easy to interpret the marginal and conditional effects directly, as we have done above. Often, though, it is more intuitive to visualize the predicted values of the response variable generated by the marginal and conditional effects. When we do this, we are working with marginal and conditional means, i.e. the predicted mean value of the response variable (with uncertainty) given specified values of the predictor variables. This is the most common (and probably the best) way to visualize the results of a multiple regression model.
Remember, a linear regression is literally just a mathematical
equation that can be solved given the values of your predictors. The
only tricky part is to keep track of the uncertainty associated with
each parameter estimate and propagate it appropriately to your
predictions. There are several ways to do this in R, but
for now we will just consider the simplest and most graphically-oriented
option: ggeffects.
The strength of ggeffects is that it is designed with
visualization in mind, so it only takes a few lines of code to yield
nice plots of your marginal/conditional effects. We start by generating
predictions with ggpredict(), which by default generates
marginal/conditional predictions for all the predictors used in your
model. The output is a list of data frames, one for each variable in
your model, containing a column of predictions and corresponding
confidence intervals.
mod.01_predictions <- ggpredict(mod.01)
mod.01_predictions
## $body_mass_g
## # Predicted values of bill_depth_mm
##
## body_mass_g | Predicted | 95% CI
## ----------------------------------------
## 2600 | 17.20 | [16.82, 17.58]
## 3000 | 17.70 | [17.40, 18.00]
## 3600 | 18.44 | [18.25, 18.64]
## 4000 | 18.94 | [18.77, 19.11]
## 4400 | 19.44 | [19.24, 19.64]
## 5000 | 20.18 | [19.88, 20.48]
## 5400 | 20.68 | [20.29, 21.07]
## 6400 | 21.92 | [21.31, 22.54]
##
## Adjusted for:
## * flipper_length_mm = 197.00
## * species = Adelie
##
## $flipper_length_mm
## # Predicted values of bill_depth_mm
##
## flipper_length_mm | Predicted | 95% CI
## ----------------------------------------------
## 170 | 18.15 | [17.73, 18.57]
## 180 | 18.46 | [18.19, 18.73]
## 190 | 18.78 | [18.62, 18.94]
## 200 | 19.10 | [18.90, 19.30]
## 210 | 19.42 | [19.08, 19.75]
## 220 | 19.73 | [19.24, 20.22]
## 230 | 20.05 | [19.40, 20.70]
## 240 | 20.37 | [19.55, 21.19]
##
## Adjusted for:
## * body_mass_g = 4050.00
## * species = Adelie
##
## $species
## # Predicted values of bill_depth_mm
##
## species | Predicted | 95% CI
## --------------------------------------
## Adelie | 19.00 | [18.83, 19.17]
## Chinstrap | 18.85 | [18.63, 19.07]
## Gentoo | 13.07 | [12.74, 13.39]
##
## Adjusted for:
## * body_mass_g = 4050.00
## * flipper_length_mm = 197.00
##
## attr(,"class")
## [1] "ggalleffects" "list"
## attr(,"model.name")
## [1] "mod.01"
If you want to make custom plots, you can always pull these data
frames out of the list and do whatever you want with them. For now,
though, we will use the default plotting functions built into
ggeffects, which are quite nice. Here are the marginal and
conditional means inferred from our model:
plot(mod.01_predictions)
## $body_mass_g
##
## $flipper_length_mm
##
## $species
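As an example of the do-it-yourself route mentioned above, you can ask ggpredict() for a single term and feed the resulting data frame straight into ggplot. A sketch (the aesthetic choices here are arbitrary):

```r
library(palmerpenguins)
library(ggeffects)
library(ggplot2)

data("penguins", package = "palmerpenguins")

# Same model as mod.01 above
mod.01 <- lm(bill_depth_mm ~ body_mass_g + flipper_length_mm + species,
             data = penguins)

# Conditional means for species only, as a plain data frame with
# columns x, predicted, conf.low, conf.high (plus a few others)
preds <- ggpredict(mod.01, terms = "species")

# Build whatever plot you like from it
ggplot(preds, aes(x = x, y = predicted, ymin = conf.low, ymax = conf.high)) +
  geom_pointrange()
```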
Having seen the right way to visualize a multiple regression model, let’s enjoy the spectacle of the wrong way: plotting the raw data, plucking the p-values out of the model, and adding decorative asterisks. I still see this all the time in papers that I review, and it is an indication that the authors do not understand multiple regression.
Let’s say the focus of our study was the difference in
bill depth across species. Adelie and
Chinstrap penguins seem to have roughly the same
bill depth, but Gentoo penguins seem to have shallower
bills. But is this significant? Let’s dig into our model summary
table.
summary(mod.01)
##
## Call:
## lm(formula = bill_depth_mm ~ body_mass_g + flipper_length_mm +
## species, data = penguins)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.2561 -0.5615 -0.0444 0.5629 2.6971
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.723542 1.434214 5.385 1.36e-07 ***
## body_mass_g 0.001242 0.000125 9.934 < 2e-16 ***
## flipper_length_mm 0.031727 0.008702 3.646 0.000309 ***
## speciesChinstrap -0.152278 0.135188 -1.126 0.260791
## speciesGentoo -5.936424 0.221499 -26.801 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.8632 on 337 degrees of freedom
## (2 observations deleted due to missingness)
## Multiple R-squared: 0.8112, Adjusted R-squared: 0.8089
## F-statistic: 361.9 on 4 and 337 DF, p-value: < 2.2e-16
Ah, yes, indeed the effect of species = Gentoo is
significant but the effect of species = Chinstrap is not.
Let’s plot some boxplots and stick those asterisks on there.
ggplot(penguins, aes(species, bill_depth_mm)) +
geom_boxplot() +
annotate("text", x = 3, y = 20, label = "***") +
theme_lucid()
## Warning: Removed 2 rows containing non-finite values (`stat_boxplot()`).
Now, in this particular case, no great harm has been done. The problem, though, is that this throws away everything we learned from our model except the p-values. That’s like baking a fancy birthday cake, then throwing away everything except the candles. P-values by themselves mean very nearly nothing and should never be interpreted. Always and only interpret parameter estimates (and/or their predictions) with their corresponding uncertainty.
My guess is that what we have covered so far was perhaps mildly
uncomfortable but familiar. Some of you already have a lot of experience
fitting multiple regression models and visualizing them with
ggeffects. That’s good. Now you can articulate that
procedure in terms of marginal and conditional effects.
You probably have questions, though. What about post-hoc tests? Interactions? Models with nonlinearities? Random effects?
Stand by.
As models become more complex, parameter estimates alone become hard or impossible to interpret, and the notion of marginal and conditional effects becomes more complicated (and eventually contradictory). Let’s consider two kinds of models that we use a lot in ecology: GAMs and interaction models.
None of the variables in the penguins data set have curvy
relationships, so we’ll have to cheat a bit. If we take only the
Chinstrap and Adelie species and plot body mass against
bill length, ignoring species, we get a curvy shape. Again,
this is just to give us something to fit a GAM to.
data <- penguins %>%
filter(species %in% c("Adelie", "Chinstrap")) %>%
select(bill_length_mm, body_mass_g)
ggplot(data, aes(body_mass_g, bill_length_mm)) +
geom_point() +
geom_smooth() +
theme_lucid()
We’ll fit a very simple GAM in which bill length is a
smooth function of body mass. Please don’t take this as an
example of how to fit GAMs. The goal is just to demonstrate a
problem.
mod.02 <- gam(bill_length_mm ~ s(body_mass_g),
data = data)
We tidy the model up, and right away we see the problem. Where is the \(\beta\) coefficient? How are we supposed to get a marginal effect out of this model?
mod.02_estimates <- tidy(mod.02)
mod.02_estimates
## # A tibble: 1 × 5
## term edf ref.df statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 s(body_mass_g) 1.91 2.41 10.4 0.0000192
If we visualize the model, the nature of the problem becomes obvious.
The slope of bill length ~ body mass is not a constant; it
changes depending on where you are along the x-axis of
body mass. What could the marginal effect of
body mass even mean in a model like this?
mod.02_predictions <- ggpredict(mod.02)
plot(mod.02_predictions)
## $body_mass_g
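One common answer, previewed here, is to average the pointwise slopes over the observed data: the so-called average marginal effect. The {marginaleffects} package computes this with avg_slopes(). A sketch (this is one possible summary, not necessarily the one you want):

```r
library(palmerpenguins)
library(mgcv)
library(marginaleffects)

data("penguins", package = "palmerpenguins")

# Same toy data and GAM as mod.02 above
data <- subset(penguins,
               species %in% c("Adelie", "Chinstrap"),
               select = c(bill_length_mm, body_mass_g))
mod.02 <- gam(bill_length_mm ~ s(body_mass_g), data = data)

# Numerical derivative of the smooth at each observed body mass,
# averaged over the whole data set
ame <- avg_slopes(mod.02, variables = "body_mass_g")
ame
```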
Hold that thought while we consider another problem.
Now consider a slightly more serious model. Let’s say we’re
reconsidering mod.01, and we really think it should include
an interaction between our factor species and our
continuous variable flipper length. For simplicity, we’ll
drop body mass from this model.
mod.03 <- lm(bill_depth_mm ~
flipper_length_mm +
species +
flipper_length_mm:species,
data = penguins)
Tidy it up and inspect the parameter estimates. We have a \(\beta\) coefficient for
flipper length, \(\beta\)
coefficients for the species levels of Chinstrap and
Gentoo, and then \(\beta\) coefficients
for flipper length : Chinstrap and
flipper length : Gentoo. Can we understand these \(\beta\) coefficients as
marginal/conditional effects?
mod.03_estimates <- tidy(mod.03)
mod.03_estimates
## # A tibble: 6 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 7.47 2.31 3.24 0.00131
## 2 flipper_length_mm 0.0572 0.0121 4.72 0.00000350
## 3 speciesChinstrap -7.14 3.99 -1.79 0.0747
## 4 speciesGentoo -15.7 3.74 -4.20 0.0000344
## 5 flipper_length_mm:speciesChinstrap 0.0351 0.0206 1.71 0.0890
## 6 flipper_length_mm:speciesGentoo 0.0497 0.0182 2.73 0.00667
Let’s look at them one at a time.
\(\beta_0\): Just like in
mod.01, the intercept represents bill_depth
when species = Adelie and
flipper length = 0.
\(\beta_1\): This is the slope of
bill depth ~ flipper length, but there’s a catch. In
mod.01, which did not have interactions, this slope was
assumed to be the same across all values of the covariates. In this
interaction model, that is no longer true; this parameter is the slope
of bill depth ~ flipper length only for
species = Adelie. We can no longer consider this a marginal
effect the way we did in mod.01.
\(\beta_2\): As in
mod.01, this is the change in bill depth when
you go from species = Adelie to
species = Chinstrap, conditional on
flipper length being set to its mean. This difference is
the conditional effect of
species = Chinstrap on bill depth.
\(\beta_3\): Same as above, but for Gentoo.
Now things get tricky.
\(\beta_4\): This is the value that
gets added to \(\beta_1\) to
give you the slope of bill depth ~ flipper length given
species = Chinstrap.
\(\beta_5\): This is the value that
gets added to \(\beta_1\) to
give you the slope of bill depth ~ flipper length given
species = Gentoo.
This is problematic. We have conditional effects for
species, but none of the \(\beta\)
coefficients can be interpreted as a marginal effect of
flipper length. What do we do?
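One practical answer, previewed here: ask for the slope separately within each level of species and, if a single number is needed, average those slopes. emmeans::emtrends() gives the per-species slopes. A sketch, refitting the same interaction model:

```r
library(palmerpenguins)
library(emmeans)

data("penguins", package = "palmerpenguins")

# Same interaction model as mod.03 above
mod.03 <- lm(bill_depth_mm ~ flipper_length_mm + species +
               flipper_length_mm:species,
             data = penguins)

# One slope of bill depth ~ flipper length per species, with CIs
trends <- emtrends(mod.03, ~ species, var = "flipper_length_mm")
trends
```

Averaging these species-specific slopes (e.g. with marginaleffects::avg_slopes()) would give one candidate “marginal” effect of flipper length, which foreshadows the second definition discussed below.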
Now let’s consider one more kind of model that complicates our notion
of marginal effects: a mixed-effect model. Returning to our penguin
data, let’s use species as a random intercept rather than a
fixed effect. Again, for simplicity flipper length will be
the only continuous variable we consider.
The math of a mixed effects model looks like this:
\[ bill.depth \sim \mathcal{N}(\mu_j, \sigma_y) \]
The subscript in \(\mu_j\) tells us
that \(\mu\) varies with \(j\), which in our model is
species. Specifically, the intercept \(\beta_0\) gets a value \(b_{0_j}\) added to it for each level of
species.
\[ \mu = (\beta_0 + b_{0_j}) + \beta_1flipper.length \]
The value added to the intercept to represent species-level variation is drawn from a normal distribution with mean = 0 and a standard deviation estimated from the data:
\[b_{0_j} \sim \mathcal{N}(0, \sigma_0)\]
Honestly, I’m shaky on the math here, but it’s close enough for our purposes. Let’s fit the model.
mod.04 <- lmer(bill_depth_mm ~ flipper_length_mm + (1 | species),
data = penguins)
Unfortunately, we cannot use broom’s tidy() on a mixed effects
model (that’s what broom.mixed is for), but we can use the lovely
modelsummary() function from the modelsummary package to get a neat
overview of our parameter estimates.
modelsummary(mod.04)
|                        | (1)     |
|------------------------|---------|
| (Intercept)            | 0.829   |
|                        | (2.415) |
| flipper_length_mm      | 0.082   |
|                        | (0.008) |
| SD (Intercept species) | 3.117   |
| SD (Observations)      | 0.980   |
| Num.Obs.               | 342     |
| R2 Marg.               | 0.110   |
| R2 Cond.               | 0.920   |
| AIC                    | 988.6   |
| BIC                    | 1003.9  |
| ICC                    | 0.9     |
| RMSE                   | 0.97    |
What can we make of this? Well, the marginal effect
of flipper length is simple enough — it’s just \(\beta_1\), the same as in
mod.01. But what if we want marginal
means? How do we account for the variation in the random
effects term \(b_{0_j}\)? Or should we
just ignore it?
ggeffects gives us a couple of options. The default
behavior is to ignore the random effect and plot the predictions based
only on the fixed terms. But we can also choose to incorporate the
uncertainty arising from the random effect. Let’s see how this affects
our results.
# Fixed only
mod.04_predictions_fixed <- ggpredict(mod.04)$flipper_length_mm %>%
tibble() %>%
mutate(model = "fixed")
# With random
mod.04_predictions_random <- ggpredict(mod.04, type = "re")$flipper_length_mm %>%
tibble() %>%
mutate(model = "random")
# Collate
mod.04_predictions <- bind_rows(mod.04_predictions_fixed,
mod.04_predictions_random) %>%
mutate(model = factor(model, levels = c("random", "fixed")))
First, notice that the fit lines for the two types of prediction are identical. This is because the line is based solely on \(\beta_1\) in both cases. What differs is the uncertainty around this estimate. When we ignore the random effect, we get a tighter confidence interval. In this case, because the random effect happens to be weak, the difference is very small, but it can be much more pronounced in models with stronger random effects.
ggplot(mod.04_predictions, aes(x, predicted,
ymin = conf.low,
ymax = conf.high,
fill = model)) +
geom_ribbon(alpha = 0.5) +
geom_line() +
labs(x = "flipper_length_mm", y = "bill_depth_mm") +
theme_lucid()
Which is the right choice? And which one can be called a “marginal mean”?
We need to revisit the definitions of marginal and conditional effects in light of the models presented above. If it’s not still fresh in your mind, reread that email correspondence that I had with Andrew Heiss. The unfortunate truth is that there are at least three mutually incompatible definitions of marginal and conditional effects floating around the world of statisticians and practitioners. We can’t fix the terminology, but we can understand the different meanings that the terms can have, and how they relate to different types of models.
1. Sliders vs. switches
This is the definition I introduced with mod.01: a
marginal effect is the (partial) slope of the response variable with
respect to a continuous predictor (e.g. bill_depth ~ flipper length),
and a conditional effect is the change in the response variable when a
categorical variable goes from its reference level
(e.g. species = Adelie) to some other level
(e.g. species = Gentoo). This, I think, happens to be the
classical definition that real statisticians usually have in mind. For
an excellent overview of this sense of marginal and conditional effects,
see Andrew Heiss’ “Marginalia”.
2. Conditioning vs. averaging
This definition comes into play when your model has interactions (or
random slopes). Under this definition, a marginal effect averages \(\beta\) parameters over an interacting
covariate, while a conditional effect fixes the interacting covariate at
a specified level. For example, in mod.03, the marginal
effect of flipper length would be the average (or
potentially some other summary) of the species-specific slopes, thus
representing the slope expected for a hypothetical species that is the
average of the three observed species. The conditional effect of
flipper length would be the slope given a specific level of
species, say species = Chinstrap. Thus, in that model,
there would be only one marginal effect of flipper length
but three conditional effects, one for each level of
species.
3. Incorporating vs. ignoring random effects
Finally, in the context of mixed effects models, the terms “marginal” and “conditional” take on yet a third meaning with respect to predictions. Marginal means are based on predictions that incorporate the uncertainty arising from the random effects component of the model, while conditional means are based on predictions that use only the fixed terms of the model. Thus, conditional means will generally have tighter confidence intervals than marginal means. When it comes to interpretation, a marginal mean can be understood as the expected value for a randomly selected penguin that belongs to one of the 3 species, but you don’t know which one. This uncertainty is reflected in the wider confidence interval, which has to cover the full range of possibility spanned by Adelie, Chinstrap, and Gentoo penguins. In contrast, a conditional mean can be understood as the expected value for a hypothetical average penguin, an Adelchintoo, if you will.
Each of these definitions is, by itself, useful and legitimate. The problem, of course, is that they cannot be reconciled with each other. Whenever you use the term “marginal” or “conditional,” it is up to you to understand which of these definitions you are working with, to align this usage with the interpretation of your models, and to communicate all this clearly to your audience.